Feeds to Scour
The three types of LLM workloads and how to serve them
modal.com·7h·
Discuss: Hacker News
📊Model Serving Economics
PLA-Serve: A Prefill-Length-Aware LLM Serving System
arxiv.org·18h
🧠Inference Serving
MIT’s new ‘recursive’ framework lets LLMs process 10 million tokens without context rot
venturebeat.com·1d·
Discuss: r/technews
🦙Ollama
PRIMAL: Processing-In-Memory Based Low-Rank Adaptation for LLM Inference Accelerator
arxiv.org·18h
🧠LLM Inference
Why AI Needs GPUs and TPUs: The Hardware Behind LLMs
blog.bytebytego.com·2d
Hardware Acceleration
Inferencing startup Baseten valued at $5B after new funding round
techzine.eu·15h
📱Edge AI Optimization
Meet the IBM researchers trying to make LLMs smarter
research.ibm.com·10h
🏆LLM Benchmarking
Playing with GPT-3, LangChain, and the OpenAI Embeddings API
shruggingface.com·20h
🦙Ollama
Co-optimization Approaches For Reliable and Efficient AI Acceleration (Peking University et al.)
semiengineering.com·6h
Hardware Acceleration
Arctic Wolf’s Liquid Clustering Architecture Tuned for Petabyte Scale
databricks.com·5h
ClickHouse
Evolution of LLMs use by a programmer
asfaload.com·6h·
Discuss: Hacker News
🪄Prompt Engineering
Building scalable agentic assistants: A graph-based approach
thenewstack.io·5h
🌐Distributed systems
Using Local LLMs to Discover High-Performance Algorithms
towardsdatascience.com·2d
🕯️Candle
From 75% to 99.6%: The Math of LLM Ensembles
shibaprasadb.com·15h·
Discuss: Hacker News
🏆LLM Benchmarking
Setting Up A Cluster of Tiny PCs For Parallel Computing - A Note To Myself
kenkoonwong.com·4h·
Discuss: Hacker News
🚀Async Optimization
Streamlining CUB with a Single-Call API
developer.nvidia.com·2h
🏟️Arena Allocators
How I Rebuilt a RAG System that Actually Works
pub.towardsai.net·18h
🔄LLM RAG Pipelines
High-performance LLM inference
modal.com·6d
🧠LLM Inference
Artificial Intelligence
radiofreemobile.com·16h
🆕New AI
Learning from Models
rodney.bearblog.dev·1d
🔍AI Interpretability